New operating conditions can cause significant performance degradation of fault diagnosis models due to the domain shift between the training and testing data distributions. While several domain adaptation approaches have been proposed to overcome such domain shifts, their applicability is limited if the fault classes represented in the two domains are not the same. To enable a better transferability of trained models between two different domains, particularly in settings where only the healthy data class is shared between the two domains, we propose a new framework for partial and open-set domain adaptation based on the generation of distinct fault signatures with a Wasserstein GAN. The main contribution of the proposed framework is controlled synthetic fault data generation with two main distinct characteristics. Firstly, the proposed approach enables the generation of unobserved fault types in the target domain by having access only to healthy samples in the target domain and faulty samples in the source domain. Secondly, the fault generation can be controlled to precisely generate distinct fault types and fault severity levels. The proposed method is particularly suited to extreme domain adaptation settings, which are especially relevant in the context of complex and safety-critical systems, where only one class is shared between the two domains. We evaluate the proposed framework on partial and open-set domain adaptation tasks on two bearing fault diagnosis case studies. Our experiments under different label space settings demonstrate the versatility of the proposed framework. The proposed method provides superior results compared to other approaches given a large domain gap.
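To make the controlled-generation idea concrete, below is a minimal PyTorch sketch of a conditional Wasserstein-GAN generator that receives a fault-type label and a scalar severity alongside the noise vector, so the type and severity of the synthetic signature can be steered; the class name, layer sizes, embedding dimension and signal length are illustrative assumptions, not the authors' architecture.

```python
# Illustrative sketch only: a generator conditioned on fault type and severity,
# as could be used inside a Wasserstein GAN for controlled fault-signature generation.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, latent_dim=64, n_fault_types=3, signal_len=1024):
        super().__init__()
        self.type_embed = nn.Embedding(n_fault_types, 16)   # learned fault-type embedding
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 16 + 1, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
            nn.Linear(512, signal_len), nn.Tanh(),
        )

    def forward(self, z, fault_type, severity):
        # z: (B, latent_dim), fault_type: (B,) long tensor, severity: (B, 1) in [0, 1]
        cond = torch.cat([z, self.type_embed(fault_type), severity], dim=1)
        return self.net(cond)

# Example: two synthetic signatures of fault type 1 at low and high severity.
gen = ConditionalGenerator()
z = torch.randn(2, 64)
fault_type = torch.tensor([1, 1])
severity = torch.tensor([[0.2], [0.8]])
fake_signals = gen(z, fault_type, severity)  # shape (2, 1024)
```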
Railways are complex systems comprising multiple infrastructure and rolling stock assets. To operate the system safely, reliably and efficiently, the condition of many components needs to be monitored. To automate this process, data-driven fault detection and diagnosis models can be employed. In practice, however, the performance of data-driven models can be compromised if the training dataset is not representative of all possible future conditions. We propose to approach this challenge by learning feature representations that are, on the one hand, invariant to operating or environmental factors but, on the other hand, sensitive to changes in the asset's health condition. We evaluate how contrastive learning can be used for supervised and unsupervised fault detection and diagnosis tasks on real condition-monitoring datasets from railway systems: one image dataset from infrastructure assets and one time-series dataset from rolling stock assets. First, we evaluate the performance of supervised contrastive feature learning on a railway sleeper defect classification task with the labelled image dataset. Second, we evaluate the performance of unsupervised contrastive feature learning, without access to faulty samples, on an anomaly detection task with the railway wheel dataset. Here, we test the hypothesis of whether a feature encoder that is sensitive to degradation is also sensitive to novel fault patterns in the data. Our results demonstrate that contrastive feature learning improves the performance of the supervised classification task on sleepers compared to state-of-the-art methods. Moreover, on the anomaly detection task on railway wheels, the detection of shelling defects is improved compared to state-of-the-art methods.
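As an illustration of the kind of objective involved, the sketch below implements a standard supervised contrastive (SupCon-style) loss in PyTorch; the batch layout, temperature and function name are assumptions for illustration and not necessarily the exact loss used in the paper.

```python
# Minimal supervised contrastive loss: pulls together embeddings with the same label,
# pushes apart embeddings with different labels, within one batch.
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    # features: (B, D) embeddings, labels: (B,) integer class ids
    z = F.normalize(features, dim=1)                       # compare in cosine-similarity space
    sim = z @ z.t() / temperature                          # (B, B) pairwise similarities
    B = z.size(0)
    self_mask = torch.eye(B, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))        # never contrast a sample with itself
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)        # avoid -inf * 0 on the diagonal
    pos_counts = pos_mask.sum(1)
    valid = pos_counts > 0                                 # anchors with at least one positive
    loss = -(log_prob * pos_mask).sum(1)[valid] / pos_counts[valid]
    return loss.mean()

# Example: embeddings of four samples from two classes.
feats = torch.randn(4, 128)
labels = torch.tensor([0, 0, 1, 1])
loss = supervised_contrastive_loss(feats, labels)
```

For the unsupervised anomaly-detection case, the same idea is typically driven by augmentations or operating-condition pairs instead of class labels, since faulty samples are unavailable during training.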
High-frequency (HF) signals are ubiquitous in the industrial world and are of great use for monitoring industrial assets. Most deep-learning tools are designed for inputs of fixed and/or very limited size, and many successful applications of deep learning in industrial contexts use as inputs extracted features, which are compact representations of the raw signal obtained manually and often laboriously. In this paper, we propose a fully unsupervised deep-learning framework that is able to extract a meaningful and sparse representation of raw HF signals. We embed into our architecture important properties of the fast discrete wavelet transform (FDWT), such as (1) the cascade algorithm, (2) the conjugate quadrature filter property that links together the wavelet, scaling and transposed filter functions, and (3) coefficient denoising. Using deep learning, we make this architecture fully learnable: both the wavelet bases and the wavelet coefficient denoising become learnable. To achieve this, we propose a new activation function that performs a learnable hard thresholding of the wavelet coefficients. With our framework, the denoising FDWT becomes a fully learnable unsupervised tool that requires neither any kind of pre-processing nor any prior knowledge of the wavelet transform. We demonstrate the benefits of embedding all these properties on three machine learning tasks performed on open-source sound datasets. We perform an ablation study of the impact of each property on the performance of the architecture, reaching results well above the baseline and above other state-of-the-art methods.
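The following minimal PyTorch sketch shows one way such a learnable hard-thresholding activation can be realised, approximating the hard step with two sigmoids so that the threshold stays trainable by gradient descent; the class name, sharpness constant and single per-module threshold are assumptions for illustration and not necessarily the paper's exact formulation.

```python
# Illustrative learnable hard thresholding: coefficients with magnitude below a learnable
# threshold are suppressed, larger ones pass through almost unchanged.
import torch
import torch.nn as nn

class LearnableHardThreshold(nn.Module):
    def __init__(self, init_threshold=0.1, sharpness=10.0):
        super().__init__()
        # one threshold per module; a per-level threshold would be a vector instead
        self.threshold = nn.Parameter(torch.tensor(float(init_threshold)))
        self.sharpness = sharpness

    def forward(self, x):
        t = torch.abs(self.threshold)                        # keep the threshold positive
        gate = torch.sigmoid(self.sharpness * (x - t)) + \
               torch.sigmoid(-self.sharpness * (x + t))      # ~1 where |x| > t, ~0 otherwise
        return x * gate

# Example: small coefficients are driven towards zero, large ones survive.
act = LearnableHardThreshold(init_threshold=0.5)
coeffs = torch.tensor([0.05, -0.2, 1.5, -3.0])
denoised = act(coeffs)
```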
Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother PaRIS is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as maximum likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity, limited memory requirements and comes with non-asymptotic bounds, convergence results and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao--Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.
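For context, the quantities that PaRIS approximates online are smoothed expectations of additive functionals; in standard state-space notation (assumed here, not quoted from the paper) they take the form

```latex
% Assumed standard notation: hidden states X_{0:T}, observations y_{0:T},
% additive functional built from increments \tilde{h}_t.
\[
  \phi_T[h_T] \;=\; \mathbb{E}\!\left[\, \sum_{t=0}^{T-1} \tilde{h}_t(X_t, X_{t+1}) \;\middle|\; Y_{0:T} = y_{0:T} \right]
\]
```

Self-normalised particle approximations of such conditional expectations are ratios of weighted sums, which is the source of the bias that the PPG sampler is designed to reduce.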
In order for artificial neural networks to begin accurately mimicking biological ones, they must be able to adapt to new exigencies without forgetting what they have learned from previous training. Lifelong learning approaches to artificial neural networks attempt to strive towards this goal, yet have not progressed far enough to be realistically deployed for natural language processing tasks. The proverbial roadblock of catastrophic forgetting still gate-keeps researchers from an adequate lifelong learning model. While efforts are being made to quell catastrophic forgetting, there is a lack of research that looks into the importance of class ordering when training on new classes for incremental learning. This is surprising as the ordering of "classes" that humans learn is heavily monitored and incredibly important. While heuristics to develop an ideal class order have been researched, this paper examines class ordering as it relates to priming as a scheme for incremental class learning. By examining the connections between various methods of priming found in humans and how those are mimicked yet remain unexplained in life-long machine learning, this paper provides a better understanding of the similarities between our biological systems and the synthetic systems while simultaneously improving current practices to combat catastrophic forgetting. Through the merging of psychological priming practices with class ordering, this paper is able to identify a generalizable method for class ordering in NLP incremental learning tasks that consistently outperforms random class ordering.
Recent work has shown that fine-tuning large pre-trained language models on a collection of tasks described via instructions, a.k.a. instruction-tuning, improves their zero and few-shot generalization to unseen tasks. However, there is a limited understanding of the performance trade-offs of different decisions made during the instruction-tuning process. These decisions include the scale and diversity of the instruction-tuning benchmark, different task sampling strategies, fine-tuning with and without demonstrations, training using specialized datasets for reasoning and dialogue, and finally, the fine-tuning objectives themselves. In this paper, we characterize the effect of instruction-tuning decisions on downstream task performance when scaling both model and benchmark sizes. To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks. Through the lens of this framework, we first present insights about instruction-tuning decisions as applied to OPT-30B and further exploit these insights to train OPT-IML 30B and 175B, which are instruction-tuned versions of OPT. OPT-IML demonstrates all three generalization abilities at both scales on four different evaluation benchmarks with diverse tasks and input formats -- PromptSource, FLAN, Super-NaturalInstructions, and UnifiedSKG. Not only does it significantly outperform OPT on all benchmarks but is also highly competitive with existing models fine-tuned on each specific benchmark. We release OPT-IML at both scales, together with the OPT-IML Bench evaluation framework.
Neural Radiance Fields (NeRFs) are emerging as a ubiquitous scene representation that allows for novel view synthesis. Increasingly, NeRFs will be shareable with other people. Before sharing a NeRF, though, it might be desirable to remove personal information or unsightly objects. Such removal is not easily achieved with the current NeRF editing frameworks. We propose a framework to remove objects from a NeRF representation created from an RGB-D sequence. Our NeRF inpainting method leverages recent work in 2D image inpainting and is guided by a user-provided mask. Our algorithm is underpinned by a confidence based view selection procedure. It chooses which of the individual 2D inpainted images to use in the creation of the NeRF, so that the resulting inpainted NeRF is 3D consistent. We show that our method for NeRF editing is effective for synthesizing plausible inpaintings in a multi-view coherent manner. We validate our approach using a new and still-challenging dataset for the task of NeRF inpainting.
Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization. While some approaches have been adopted to refine representations via auxiliary self-supervised losses while simultaneously learning decision policies, learning compositional representations from hand-designed and context-independent self-supervised losses (multi-view) still adapts relatively slowly to the real world, which contains many non-IID subspaces requiring rapid distribution shift in both time and spatial attention patterns at varying levels of abstraction. In contrast, supervised language model cascades have shown the flexibility to adapt to many diverse manifolds, and hints of self-learning needed for autonomous task transfer. However, to date, transfer methods for language models like few-shot learning and fine-tuning still require human supervision and transfer learning using self-learning methods has been underexplored. We propose a self-supervised loss policy called contrastive distillation which manifests latent variables with high mutual information with both source and target tasks from weights to tokens. We show how this outperforms common methods of transfer learning and suggests a useful design axis of trading off compute for generalizability for online transfer. Contrastive distillation is improved through sampling from memory and suggests a simple algorithm for more efficiently sampling negative examples for contrastive losses than random sampling.
Despite recent success in large language model (LLM) reasoning, LLMs still struggle with hierarchical multi-step reasoning like generating complex programs. In these cases, humans often start with a high-level algorithmic design and implement each part gradually. We introduce Parsel, a framework enabling automatic implementation and validation of complex algorithms with code LLMs, based on hierarchical function descriptions in natural language. Parsel can be used across domains requiring hierarchical reasoning, e.g. code synthesis, theorem proving, and robotic planning. We demonstrate Parsel's capabilities by using it to generate complex programs that cannot currently be automatically implemented from one description and backtranslating Python programs in the APPS dataset. Beyond modeling capabilities, Parsel allows problem-solving with high-level algorithmic designs, benefiting both students and professional programmers.
We study mechanism design with predictions for the obnoxious facility location problem. We present deterministic strategyproof mechanisms that display tradeoffs between robustness and consistency on segments, squares, circles and trees. All these mechanisms are actually group strategyproof, with the exception of the case of squares, where manipulations from coalitions of two agents exist. We prove that these tradeoffs are optimal in the 1-dimensional case.
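For readers unfamiliar with the learning-augmented setting, the robustness/consistency terminology can be read as in the generic formulation below (assumed here rather than taken from the paper), for a maximisation objective such as total agent distance in obnoxious facility location:

```latex
% Generic learning-augmented definitions (assumed, not the paper's exact statement):
% a mechanism M with access to a prediction \hat{x} is
\[
  \text{$\alpha$-consistent:}\quad \frac{\mathrm{OPT}(I)}{M(I,\hat{x})} \le \alpha
  \ \text{ whenever the prediction } \hat{x} \text{ is correct};
  \qquad
  \text{$\beta$-robust:}\quad \frac{\mathrm{OPT}(I)}{M(I,\hat{x})} \le \beta
  \ \text{ for every prediction } \hat{x}.
\]
```

The tradeoffs established in the paper can then be read as characterising which pairs of consistency and robustness guarantees are simultaneously achievable by strategyproof mechanisms on each metric space.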